Analysis of segmental duplications and genome assembly in the mouse.
Abstract
Limited comparative studies suggest that the human genome is particularly enriched for recent segmental duplications. The extent of segmental duplications in other mammalian genomes is unknown and confounded by methodological differences in genome assembly. Here, we present a detailed analysis of recent duplication content within the mouse genome using a whole-genome assembly comparison method and a novel assembly-independent method, designed to take advantage of the reduced allelic variation of the C57BL/6J strain. We conservatively estimate that approximately 57% of all highly identical segmental duplications (≥90%) were misassembled or collapsed within the working draft WGS assembly. The WGS approach often leaves duplications fragmented and unassigned to a chromosome when compared with the clone-ordered-based approach. Our preliminary analysis suggests that 1.7%-2.0% of the mouse genome is part of recent large segmental duplications (about half of what is observed for the human genome). We have constructed a mouse segmental duplication database to aid in the characterization of these regions and their integration into the final mouse genome assembly. This work suggests significant biological differences in the architecture of recent segmental duplications between human and mouse. In addition, our unique method provides the means for improving whole-genome shotgun sequence assembly of mouse and future mammalian genomes.
Similar resources
Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements.
The sequence of the mouse genome allows one to compare the conservation of synteny between the human and mouse genomes and to explore regions that might have been involved in major rearrangements during the evolution of these two species (evolutionary genome rearrangements). Recent segmental duplications (or duplicons) are paralogous DNA sequences with high sequence identity that account for...
Segmental Duplications as a Complement Strategy to Short Tandem Repeats in the Prenatal Diagnosis of Down Syndrome
Background: Quantitative fluorescence-polymerase chain reaction (QF-PCR) is an inexpensive and accurate method for the prenatal diagnosis of aneuploidies that applies short tandem repeats (STRs) as a chromosome-specific marker. Despite its apparent advantages, QF-PCR is not applicable in all cases due to the presence of uninformative STRs. This study was carried out to investigate the efficienc...
Mis-Assembled “Segmental Duplications” in Two Versions of the Bos taurus Genome
We analyzed the whole genome sequence coverage in two versions of the Bos taurus genome and identified all regions longer than five kilobases (Kbp) that are duplicated within chromosomes with >99% sequence fidelity in both copies. We call these regions High Fidelity Duplications (HFDs). The two assemblies were Btau 4.2, produced by the Human Genome Sequencing Center at Baylor College of Medicin...
Recent segmental duplications in the human genome.
Primate-specific segmental duplications are considered important in human disease and evolution. The inability to distinguish between allelic and duplication sequence overlap has hampered their characterization as well as assembly and annotation of our genome. We developed a method whereby each public sequence is analyzed at the clone level for overrepresentation within a whole-genome shotgun s...
Genome-Wide Signatures of ‘Rearrangement Hotspots’ within Segmental Duplications in Humans
The primary objective of this study was to create a genome-wide high resolution map (i.e., >100 bp) of 'rearrangement hotspots' which can facilitate the identification of regions capable of mediating de novo deletions or duplications in humans. A hierarchical method was employed to fragment segmental duplications (SDs) into multiple smaller SD units. Combining an end space free pairwise alignme...
Journal:
- Genome research
Volume 14, Issue 5
Pages -
Publication date: 2004